Variant Discovery ◾ 157
The VCF file format is the standard format for variant calling. A VCF file can be converted into an ANNOVAR input file by using the “-format vcf4” argument.
Figure 4.17 shows the directory tree, which includes the “annovar” directory, containing the ANNOVAR scripts and subdirectories, and the “input” directory, containing the VCF files (sarscov2.vcf and humanSNP.vcf) from the previous SARS-CoV-2 and human variant calling examples. We copied them to this directory for simplicity. The following command converts the “humanSNP.vcf” file into the ANNOVAR input file “humanSNP.avinput”:
convert2annovar.pl \
-format vcf4 input/humanSNP.vcf \
> input/humanSNP.avinput
Figure 4.18 shows the ANNOVAR input file, which contains the five essential columns (chromosome, start position, end position, reference allele, and alternative allele) followed by three additional columns.
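The column layout described above can be checked quickly from the command line. The following is a minimal sketch, not part of ANNOVAR itself: it writes a small made-up file (hypothetical positions and values, not real calls) and assumes the three extra columns hold zygosity, quality, and read depth, which may differ in your own conversion.

```shell
# Illustrative rows only (made-up positions and values, not real variant calls).
# Assumed column layout: chr, start, end, ref, alt, zygosity, quality, depth.
avinput="$(mktemp)"
printf '1\t1000\t1000\tA\tG\thet\t50\t20\n' >  "$avinput"
printf '1\t2000\t2000\tC\tT\thom\t60\t30\n' >> "$avinput"
printf '2\t3000\t3000\tG\tA\thet\t45\t15\n' >> "$avinput"

# Show only the five essential columns of each variant.
essential="$(cut -f1-5 "$avinput")"
printf '%s\n' "$essential"

# Count variants per value of the sixth column (zygosity in this assumed layout).
zygosity_counts="$(awk -F'\t' '{ n[$6]++ } END { for (z in n) print z, n[z] }' "$avinput" | sort)"
printf '%s\n' "$zygosity_counts"
```

To inspect a real conversion, point the same `cut` and `awk` commands at “input/humanSNP.avinput” instead of the temporary file.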
To convert other variant calling file formats, run “convert2annovar.pl -h” to see the available options. This command can also be used with the “-dbSNP” option to add dbSNP accessions.
Variant annotation with ANNOVAR:
The “annotate_variation.pl” script is the core program for ANNOVAR annotation. It requires an ANNOVAR input file. However, the “table_annovar.pl” script can also be used for annotation, and it takes a VCF file as input directly when the “-vcfinput” option is given.
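As a rough sketch of the second route, “table_annovar.pl” could be called on the VCF file as shown below. The genome build (hg19), the “humandb” database directory, and the refGene protocol are assumptions made for illustration and are not taken from the text; substitute whatever build and databases you have downloaded. The script only prints the command if ANNOVAR or the input file is not found.

```shell
# Assumptions (not from the text): hg19 build, a "humandb" database directory
# inside "annovar", and the refGene gene-based database already downloaded.
# Intended to be run from inside the "annovar" directory.
annot_cmd="./table_annovar.pl ../input/humanSNP.vcf humandb/ \
-buildver hg19 -out ../output/humanSNPannot -remove \
-protocol refGene -operation g -nastring . -vcfinput"

if [ -x ./table_annovar.pl ] && [ -f ../input/humanSNP.vcf ]; then
    # Word splitting is intended here: no argument contains spaces.
    $annot_cmd
else
    printf 'Would run: %s\n' "$annot_cmd"
fi
```

Here “-protocol” names the database to use, “-operation g” marks it as gene-based, and “-nastring .” fills missing annotations with a dot.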
FIGURE 4.17 The directory tree of the ANNOVAR.
FIGURE 4.18 ANNOVAR input file.
./annotate_variation.pl \
-out ../output/humanSNPannot \